Land use and land cover changes influence the land surface temperature and vegetation in Penang Island, Peninsular Malaysia

The ecological changes in vegetation and land of an area can be monitored and managed through the assessment of its past and present land use and land cover (LULC). In this study, we assessed the changes in the LULC of Penang Island between 2010 and 2021. We also determined the corresponding impacts on the land surface temperature (LST) and on vegetation, measured as the normalized difference vegetation index (NDVI). Landsat-5 and Landsat-8 images were used for the study. The LULC types were classified using both supervised and unsupervised multivariate maximum likelihood techniques. The LULC change analysis revealed a considerable increase in the urbanized areas (45.71%), a slight increase in the forests (1.57%) and a sizeable reduction in the agricultural/herbaceous areas (−33.49%) of the city within the study period. The urbanized areas were observed to have the highest LST in 2010 and 2021 (28.75–34.0 °C) followed by the bare land (29.76–29 °C). The increase in temperature could have been driven by the reduction in the greenness of the city coupled with the openness of vegetation cover. Similarly, strong positive correlations were observed between the LST and NDVI in the urbanized areas (R² = 0.92) and bare lands (R² = 0.86). We, therefore, hypothesize that urbanization is the main driver of the LULC changes on Penang Island.

In recent times, assessments of land use and land cover (LULC) changes have been used in the monitoring and management of ecological changes in different parts of the world 1. Land use refers to the different human activities on the land which result in changes in the vegetation structure, water bodies, soil, rocks and other natural resources of an area 2. Having accurate knowledge of the LULC of a place enhances the proper management of the challenges associated with the land. Apart from this, knowledge of the past, present and future LULC changes also enables an appropriate estimation of the socio-economic and environmental impacts of such changes 3. These LULC changes are direct products of increased global human activities and urbanization 4,5.
Human activities strongly affect the vegetation cover of terrestrial ecosystems, thereby leading to environmental changes at local, regional and global levels 6. These environmental changes also include an increase in the surface temperature due to the transformation of vegetation covers to other land use forms such as bare surfaces, solid surfaces, and agricultural lands 7,8. It has been predicted that the land surface temperature of most parts of the world, especially developing countries, will increase geometrically by the year 2050 due to the impacts of pollution and urbanization 9. Among other factors, population increase and the uncontrolled and improper management of changes in the LULC of urban areas contribute to global climate change, which increases surface temperatures. Hence, the assessment of LULC changes in an area will enable the understanding of the degree and spatial extent of anthropogenic changes there 6.
The land surface temperature (LST) has been described as essential in the assessment of the earth's surface features, including the LULC 10. Many studies have shown the influence of LULC on the LST of different parts of the world using remote sensing techniques 6–8. Satellite remote sensing and geographic information systems (GIS) are viable tools for investigating the intensity of human impacts on ecosystems through the mapping of LULC changes and vegetation indices within a stipulated period 11. Remote sensing makes it easier and more economical to access data on vegetation and LULC changes in an area at a specific time 12. These spatial data can then be managed and analyzed accurately using GIS techniques 13. Landsat sensors such as the Landsat 5 Thematic Mapper (TM), Landsat 7 Enhanced Thematic Mapper (ETM) and Landsat 8 Operational Land Imager (OLI) have been used to assess LULC and vegetation indices across the world 14.
Studying the LST of an area could supply useful information on human survival in that area 15. It could also provide information on the survival of crops because extreme climatic conditions negatively influence the growth, survival and productivity of crops 16. Landsat data such as those from Landsat 8 have been very instrumental in the study of LST at both local and regional scales 17. On the other hand, the normalized difference vegetation index (NDVI) has been used to measure the presence and dynamics of vegetation such as the green leaf area index, vegetation cover, green biomass and vegetation productivity 18. It indicates the vegetation condition and predicts the productivity of plants in several areas of the world 19. The NDVI relies on the way vegetation interacts with electromagnetic radiation: green vegetation shows low reflectance in the visible spectrum because photosynthetic pigments absorb that light, whereas it has a maximum reflectance in the near-infrared region 20. In this study, the impact of LULC changes on the LST and vegetation of Penang Island between the years 2010 and 2021 was examined.
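The band contrast behind the NDVI described above can be illustrated with a minimal sketch. It assumes the standard definition NDVI = (NIR − Red) / (NIR + Red); the function name and the reflectance values are illustrative assumptions, not data from this study. On Landsat 8 OLI the red and near-infrared bands are bands 4 and 5, and on Landsat 5 TM they are bands 3 and 4.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumption: pixels with no signal in either band are given NDVI 0.
        return 0.0
    return (nir - red) / total


# Made-up reflectances for illustration only (not study data).
dense_vegetation = ndvi(nir=0.50, red=0.05)  # strong NIR reflectance, red absorbed
bare_surface = ndvi(nir=0.30, red=0.25)      # similar reflectance in both bands
```

Values close to 1 indicate dense green vegetation, while values near 0 indicate sparsely vegetated or built-up surfaces.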
Penang Island was chosen because it houses a UNESCO World Heritage Site (Georgetown). Georgetown has a rich and diverse cultural heritage; hence it was listed as a World Heritage Site in 2008 21. Over the years, the Island has witnessed many developmental transformations due to rapid industrialization and other anthropogenic influences 22. It has been said that studying past events in the climate and land use change of an area will enhance proper deductions of the effects of such factors in the future 23. Consequently, an assessment of the impacts of these anthropogenic changes on the vegetation of the famous city became imperative. Hence, the specific objectives were to assess the LULC changes in Penang over the 11 years, to assess the changes in LST and NDVI, and to assess the relationship between the LST and NDVI with regard to the LULC classes.

Results
The overall accuracy and Kappa coefficient for the year 2010 are 85.76% and 0.85 respectively, while those for the year 2021 are 88.53% and 0.89 respectively (Table 1).
Regarding the producer's accuracy (2010 and 2021), all the LULC classes were greater than 70%. Similarly, the user's accuracies for all the LULC classes also exceeded 70%. This indicates that the classification of the LULC classes was achieved with very high accuracy. The results of the LULC classification of Penang Island (Table 2)
The analysis of the rate of change reveals a drastic increase in the urbanized areas, which had a 45.71% increase over the 11 years of observation (Table 3). During this period, the urbanized area increased from 6111.62 to
LULC impact on LST. The satellite images studied have been characterized for the land surface temperature regarding the LULC classes. The average LST values as influenced by the LULC are presented in Table 4.
In the year 2010, the bare lands and urbanized areas had the highest LST of 29.76 °C and 28.75 °C respectively (Fig. 3). Also in 2021, the urbanized areas exhibited the highest LST of 34.0 °C (Fig. 4). The forest lands had the lowest LST (23.60 °C) in 2010, which increased to 31.4 °C in 2021. Furthermore, the forest and agricultural lands

Relationship between LST and NDVI across LULC types. The relationship between the LST and NDVI is shown in Table 5. As revealed by the linear regression analysis, the LST had negative correlations with the NDVI values of forests, agricultural lands and bare lands in the year 2010. In 2010, water bodies had the highest positive correlation between LST and NDVI (R² = 0.91), followed by the urbanized areas (R² = 0.53). The strongest negative correlation between LST and NDVI was observed in the agricultural lands (R² = 0.32), followed by the forest lands (R² = 0.14). Interestingly, the forests and agricultural

Discussion
The use of a remote sensing approach in the assessment of the impacts of LULC changes on the LST and vegetation cover of an area is beneficial in enhancing appropriate land management decisions 9. In this study, the agricultural (herbaceous) land and rocks experienced a drastic decrease in land area during these 11 years.
The land lost from the agricultural lands and rocks was likely gained by the urbanized areas. A similar study revealed that Penang has a track record of increased urbanization due to the vision of the State Government to make the State a renowned World Trade Centre 24. This has led to a great influx of investors into the State, hence leading to the industrialization of the State. Penang Island has already been described as the most developed part of the State, comprising the international airport, factories, and many residential buildings 22. Another study on land use and land cover changes in the capital city of Malaysia (Kuala Lumpur) revealed the same trend of rapid loss of green areas to urbanized areas over the last 15 years 9. This high rate of urbanization not only affected the vegetation cover of the city but also increased the pollution impacts on the city. Landsat data are known to be very useful in understanding the impacts of LULC on the LST of an area 9. The increase in the LST of this study area is similar to previous work which showed that the urbanized areas of Penang Island had the highest LST in 1999 and 2007 24. A similar observation was recorded in other cities in which the urban areas had the highest LST 10,25. This means that the change in the urbanized areas and bare lands of the Island has influenced the LST. This is also caused by the replacement of the vegetation cover of the agricultural or herbaceous lands with materials such as concrete, stone and tar used for urbanization. The lower LST values exhibited by forests and agricultural lands are attributed to their contributions to the photosynthetic pool of the area, thereby reducing the heat 24. Therefore, urbanization involving buildings incorporated with vegetation (green buildings) and fewer concrete structures has been suggested as a way of reducing the LST of an area 26,27.
It has been reported that the strength of the correlation is revealed by the linear regression coefficient 28. The negative correlations observed between the LST and NDVI across the LULC classes indicate that the higher the surface temperature, the lower the vegetation cover or biomass of those LULC types, and vice versa 24. Areas with high NDVI have been described as having enough vegetation cover to produce cooling effects, thereby reducing the surface temperature 22.

Conclusion
This study has revealed that Penang Island had a considerable increase in the urbanized areas and bare lands coupled with a greater loss in the green areas (particularly the agricultural/herbaceous lands). The forests in this city only had a slight increase in land area. This is somewhat commendable as the city was able to maintain its forest lands despite the rapid urbanization. However, the loss of agricultural or herbaceous lands to urbanization is worrisome. This is because the agricultural/herbaceous lands also had roles to play in maintaining the vegetation cover and photosynthetic productivity of the city. Also, urbanization can

Methods
Study area. Penang is situated in the northern part of Peninsular Malaysia and lies within the latitudes 5°12' N to 5°30' N and longitudes 100°09' E to 100°26' E (Fig. 7). Penang, with a land area of 295 km², has an estimated population of 720,000 and is regarded as the most populated island in Malaysia. Penang borders Kedah State to the north and east and Perak State to the south. There are two main parts of Penang State: Penang Island and the mainland, which is also known as Seberang Perai. These two parts of the State are connected by two bridges. The eastern part of Penang Island is the most urbanized area, comprising industries, commercial centres and residential buildings. However, the western part is less developed, comprising mainly hilly terrain and forests 22. This study is focused on the Island part of Penang. The island has a year-round equatorial climate (hot and humid). It has a mean annual temperature ranging between 27 and 30 °C while the mean annual relative humidity ranges between 70 and 90%. Also, the mean annual rainfall is about 267–624 cm.
Data acquisition. The flow chart of the methodology is presented in Fig. 8.
where E is the land surface emissivity, PV is the proportion of vegetation, and 0.986 corresponds to a correction value of the equation.
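The emissivity step described above is often computed from the NDVI. The following is a minimal sketch assuming the widely used form E = 0.004 × PV + 0.986, with PV = ((NDVI − NDVI_min) / (NDVI_max − NDVI_min))². Only the 0.986 correction value is stated in the text; the 0.004 coefficient, the PV expression and the function names are assumptions for illustration.

```python
def proportion_of_vegetation(ndvi: float, ndvi_min: float, ndvi_max: float) -> float:
    """PV from NDVI scaled between the scene minimum and maximum (assumed form)."""
    if ndvi_max == ndvi_min:
        raise ValueError("ndvi_max and ndvi_min must differ")
    scaled = (ndvi - ndvi_min) / (ndvi_max - ndvi_min)
    return scaled ** 2


def emissivity(pv: float, slope: float = 0.004, offset: float = 0.986) -> float:
    """Land surface emissivity E = slope * PV + offset.

    offset = 0.986 is the correction value given in the text;
    slope = 0.004 is an assumed, commonly used coefficient.
    """
    return slope * pv + offset


# Illustrative NDVI values only (not study data).
pv_example = proportion_of_vegetation(ndvi=0.5, ndvi_min=0.0, ndvi_max=1.0)
e_example = emissivity(pv_example)
```

With this form, emissivity ranges from 0.986 for a pixel with no vegetation (PV = 0) to 0.990 for a fully vegetated pixel (PV = 1).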
Land Surface Temperature (LST). Land Surface Temperature (LST) is the radiative temperature which is calculated using top of atmosphere brightness temperature, the wavelength of emitted radiance and land surface emissivity.
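A minimal sketch of this calculation follows, assuming the commonly used single-channel form LST = BT / (1 + (λ × BT / c2) × ln E), where BT is the top-of-atmosphere brightness temperature in kelvin, λ is the wavelength of emitted radiance in µm, E is the land surface emissivity and c2 = 14388 µm K. The function names and input values are illustrative assumptions, not the study's own code or data.

```python
import math

C2_UM_K = 14388.0  # c2 in µm K

# Band wavelengths in µm for the two thermal bands used in the study.
WAVELENGTH_UM = {"landsat5_band6": 11.5, "landsat8_band10": 10.8}


def land_surface_temperature(bt_kelvin: float, wavelength_um: float,
                             emissivity: float) -> float:
    """Emissivity-corrected LST in kelvin (assumed single-channel form)."""
    if not 0.0 < emissivity <= 1.0:
        raise ValueError("emissivity must be in (0, 1]")
    correction = 1.0 + (wavelength_um * bt_kelvin / C2_UM_K) * math.log(emissivity)
    return bt_kelvin / correction


def kelvin_to_celsius(kelvin: float) -> float:
    return kelvin - 273.15


# Illustrative brightness temperature and emissivity (not study data).
lst_k = land_surface_temperature(300.0, WAVELENGTH_UM["landsat8_band10"], 0.986)
lst_c = kelvin_to_celsius(lst_k)
```

Because ln E is negative for E below 1, the corrected LST is slightly higher than the brightness temperature.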
Here c2 is 14388 µm K. The value of λ is 11.5 µm for Landsat 5 (Band 6) and 10.8 µm for Landsat 8 (Band 10).
Image classification. Both the unsupervised method, involving a random assignment of sample training points, and the supervised method of satellite image classification were employed in this study for determining the LULC types. This mixture of image classification methods has been reported as vital in achieving a high accuracy level 33. Bands 5, 4 and 3 were used to classify Landsat 8 while bands 4, 3 and 2 were used for classifying Landsat 5. We used the extraction by mask in the spatial analyst tool of ArcMap 10.2.1 software to extract the study area from the selected bands of the Landsat satellite images. A widely used supervised image classification method was adopted for classifying the Landsat bands in this study 32,34. The principle of operation of this method involves the identification of known sample training points which are then used to classify other unknown points with related spectral signatures 35. The three monochromatic satellite bands were combined to produce the false colour composite (FCC) using the data management tool 36. Training points were then selected by drawing polygons on each LULC type. The LULC types adopted for this study include urbanized areas, agricultural land, rocks, forests, bare surfaces, and water bodies. These were modified from the LULC types of the IPCC Good Practice Guidance 37. A minimum of 40 sample points were selected randomly for each category of LULC type 36. Having prior knowledge of the study area assisted in the selection of the training points 38.
The multivariate maximum likelihood classification (MLC) technique was used for transforming the images. Other image transformation techniques have been used by researchers. These include the fuzzy set classifier, neural networks (NN) classifier, extraction and classification of homogenous objects (ECHO) classifier, per-field classifier, sub-pixel classifier, decision trees (DTs), support vector machines (SVMs), minimum distance classifier (MDC) and so on 39. The adoption of any of these techniques is dependent on the knowledge of the area of study, band selection, accessibility of data, the complexity of the landscape, the classification algorithm, and the proficiency of the analyst 39. We preferred MLC to other techniques in this study due to its reported high level of accuracy in tropical regions 32,34. Another reason for choosing MLC is that it is readily incorporated in many widely used GIS software packages. The MLC algorithm operates by computing, for each pixel, the probability of membership in every class and assigning the pixel to the class with the highest probability. It is regarded as a parametric classifier that assumes the data in each class are approximately normally distributed 39. We ensured the accuracy of this classifier by assigning a large number of training sample points using our prior knowledge of the study area.
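The decision rule of a Gaussian maximum likelihood classifier can be sketched as follows. This is a simplified two-band illustration written in plain Python, not the ArcMap implementation used in the study; the class names, training reflectances and function names are hypothetical.

```python
import math


def fit_class(samples):
    """Estimate the mean vector and 2x2 covariance of (band_a, band_b) training pixels."""
    n = len(samples)
    mean_a = sum(s[0] for s in samples) / n
    mean_b = sum(s[1] for s in samples) / n
    var_a = sum((s[0] - mean_a) ** 2 for s in samples) / (n - 1)
    var_b = sum((s[1] - mean_b) ** 2 for s in samples) / (n - 1)
    cov_ab = sum((s[0] - mean_a) * (s[1] - mean_b) for s in samples) / (n - 1)
    return (mean_a, mean_b), (var_a, cov_ab, var_b)


def log_likelihood(pixel, mean, cov):
    """Log density of a bivariate normal distribution evaluated at the pixel."""
    var_a, cov_ab, var_b = cov
    det = var_a * var_b - cov_ab ** 2
    da, db = pixel[0] - mean[0], pixel[1] - mean[1]
    # Mahalanobis distance using the closed-form inverse of a 2x2 matrix.
    maha = (var_b * da * da - 2 * cov_ab * da * db + var_a * db * db) / det
    return -0.5 * (math.log(det) + maha) - math.log(2 * math.pi)


def classify(pixel, models):
    """Assign the pixel to the class with the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(pixel, *models[name]))


# Hypothetical (red, NIR) training reflectances; not study data.
training = {
    "forest": [(0.04, 0.45), (0.05, 0.50), (0.03, 0.48), (0.06, 0.42), (0.05, 0.47)],
    "urban": [(0.20, 0.25), (0.22, 0.28), (0.18, 0.22), (0.25, 0.27), (0.21, 0.24)],
}
models = {name: fit_class(samples) for name, samples in training.items()}
label = classify((0.05, 0.46), models)
```

Equal prior probabilities are assumed for all classes here, so the class with the highest likelihood is also the class with the highest posterior probability.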
Description of the LULC categories. The urbanized area is the developed part of the study area. This includes houses, roads, railways, and industries. This class is also referred to as settlement in other literature 40. Agricultural land is the part of the study area dominated by agricultural activities and herbaceous plants and grasses. Agricultural land is generally a product of deforestation 36. Rocks are parts of the study area comprising solid mineral materials. Bare land is bare soil that has been exposed by either natural or human activities.
Forests are parts of the study area dominated by trees. They can be primary or secondary forests depending on the rate of disturbances. According to ref. 41, forest land is an area of more than 0.5 ha of flora comprising trees (above 5 m in height) with a canopy cover greater than 10%. The forests in Penang include both primary and secondary forests 42. Water bodies are parts of the study area covered by water seasonally or permanently. These include seas, rivers, lakes, ponds, streams, or reservoirs 40.
Determination of change in the LULC. The rate and extent of change in the LULC of Penang within the periods under consideration were determined following the formula below 43:
where Ta means the total area.
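One common way to express such a change is sketched below, assuming the percentage change of a class is computed as (Ta at the end − Ta at the start) / Ta at the start × 100. This formulation, the function names and the area values are assumptions for illustration and may differ from the exact formula of ref. 43.

```python
def percent_change(area_start: float, area_end: float) -> float:
    """Percentage change in total area (Ta) between two dates (assumed formulation)."""
    if area_start <= 0:
        raise ValueError("area_start must be positive")
    return (area_end - area_start) / area_start * 100.0


def annual_rate(area_start: float, area_end: float, years: float) -> float:
    """Average change in area per year over the observation period."""
    if years <= 0:
        raise ValueError("years must be positive")
    return (area_end - area_start) / years


# Hypothetical class areas in hectares (not study data).
gain = percent_change(200.0, 300.0)
loss = percent_change(200.0, 150.0)
rate = annual_rate(200.0, 310.0, 11)
```

A positive value indicates that a class expanded over the period and a negative value indicates that it shrank.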

Determination of relationship between LST and NDVI. The values of LST and NDVI at 20 random points of each LULC class were used. The relationship between the LST and NDVI across all the LULC classes in each year was determined using bivariate linear regression analysis. This was done in the Paleontological Statistics (PAST) package, version 3.0.
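The regression step can be illustrated with a minimal ordinary least squares sketch in plain Python. It is not the PAST implementation; the function name and the sample points are hypothetical.

```python
import math


def linear_regression(x, y):
    """Return slope, intercept, Pearson r and R² for a bivariate least squares fit."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r * r


# Hypothetical NDVI and LST (°C) samples for one LULC class (not study data).
ndvi_points = [0.10, 0.20, 0.30, 0.40, 0.50]
lst_points = [34.0, 33.1, 31.9, 31.2, 29.8]
slope, intercept, r, r_squared = linear_regression(ndvi_points, lst_points)
```

R² measures only the strength of the fit. The direction of the relationship (positive or negative) is given by the sign of the slope or of r.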
Classification accuracy assessment. The classification accuracy was assessed by taking ground truth coordinate data of the LULC of the study area using a global positioning system (GPS) device (Garmin eTrex 10). These data were compared with the LULC classified in this study 32. Consequently, an error matrix was generated. The matrix uses the ground truth data to explain the accuracy of the classified LULC. The error matrix comprises the user's accuracy, the producer's accuracy, the overall accuracy and the Kappa index 32.
The producer's accuracy (the complement of the omission error) represents the probability that a reference pixel is correctly classified. It is the number of correctly classified pixels of a class divided by the total number of reference (ground truth) pixels of that class. The user's accuracy (the complement of the commission error) represents the probability that a classified pixel matches the class on the ground 36. It is the number of correctly classified pixels of a class divided by the total number of pixels assigned to that class. The statistical accuracy of the matrix was determined using the Kappa coefficient 44. This Kappa coefficient ranges from −1 to +1 45. The overall accuracy of the classification was determined by dividing the total number of correctly classified pixels by the total number of sampled ground truth data 40.
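These accuracy measures can be computed from an error matrix as in the following minimal sketch. The standard definitions of producer's accuracy, user's accuracy, overall accuracy and Cohen's Kappa are assumed; the function name and the two-class matrix are hypothetical and do not reproduce Table 1.

```python
def accuracy_report(matrix):
    """Accuracy measures from a square error matrix.

    matrix[i][j] = number of samples classified as class i whose
    ground truth (reference) class is j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]  # samples assigned to each class
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # reference samples
    correct = sum(matrix[i][i] for i in range(k))

    overall = correct / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    return {
        "overall": overall,
        "kappa": kappa,
        "users": [matrix[i][i] / row_totals[i] for i in range(k)],
        "producers": [matrix[j][j] / col_totals[j] for j in range(k)],
    }


# Hypothetical two-class error matrix (not study data).
example = [
    [45, 5],
    [10, 40],
]
report = accuracy_report(example)
```

For this hypothetical matrix, the overall accuracy is 0.85 and the Kappa coefficient is 0.70, since the agreement expected by chance is 0.50.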

Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.